A robust ranking algorithm to spamming
Authors
Abstract
The ranking problem in web-based rating systems has attracted much attention. A good ranking algorithm should be robust against spammer attacks. Here we propose a correlation-based reputation algorithm to solve the ranking problem in rating systems where users vote on objects with ratings. In this algorithm, the reputation of a user is determined iteratively by the correlation coefficient between his/her rating vector and the corresponding objects’ weighted average rating vector. Compared with the iterative refinement (IR) and mean score algorithms, results for both artificial and real data indicate that the present algorithm is more robust against spammer attacks.

Introduction. – The abundance of available information troubles people every day, and information filtering techniques have developed quickly in recent years. An important aspect of information filtering is the rating system. Everyday examples include opinion websites (eBay, Amazon, MovieLens, Netflix, etc.), where users evaluate objects. Ranking is one of the most common ways to describe the aggregated evaluation result, giving a simple representation of the comparative quality of objects. PageRank is the most widely applied algorithm for search engines, which rank websites based on the directed hyperlink graph [1]. Recently, some iterative algorithms have been used in scientific citation networks to rank scientists [2]. Both the hyperlink network and the scientific citation network are unipartite systems, but many other rating systems have a bipartite structure with two kinds of nodes: users as evaluators and objects as candidates [3]. In this paper, we consider the ranking problem in such rating systems, where users vote on objects with ratings, and devise algorithms to rate objects accurately. Ranking objects according to their average ratings is a straightforward statistical method.
However, in an open evaluation system, a user may not be serious about voting, or may be inexperienced in the relevant field and give unreasonable ratings. Even worse, the user might be a malicious spammer who gives biased ratings on purpose. Therefore, evaluation by simply averaging all ratings may be less accurate. Building a reputation system for users is a good way to solve this problem [4, 5]: users with higher reputations are assigned more weight. Such reputation mechanisms are widely used in online systems, such as online shops [6], online auctions [7], Wikipedia [8], P2P sharing networks [9], etc. There are already some ranking algorithms based on reputation estimates [10–13]. In [12, 13], an iterative refinement (IR) algorithm is proposed, in which a user’s reputation is inversely proportional to the difference between his/her rating vector and the corresponding objects’ weighted average rating vector. The weighted ratings of all objects and the reputations of all users are recalculated at each step, until the change in the weighted ratings between two iteration steps falls below a certain threshold. Kerchove and Dooren [11] modify the iterative refinement algorithm by assigning trust to each individual rating. Most previous works ignore the influence of spammer attacks on rating systems. In this paper, we propose a correlation-based ranking algorithm. The reputation of a user is determined by the correlation coefficient between the user’s rating vector and the corresponding objects’ weighted average rating vector. The effectiveness of the correlation-based ranking algorithm was tested on artificial data by comparing it with other algorithms. The results show that correlation based ...
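The correlation-based iteration described above can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform starting reputation, the clipping of negative correlations to zero, the stopping threshold `tol`, and the function names `weighted_average` and `correlation_rank` are all assumptions made for illustration, since the excerpt does not specify them.

```python
import numpy as np

def weighted_average(R0, rated, weights):
    """Reputation-weighted average rating of each object.

    Falls back to the plain mean for objects whose raters all have zero weight.
    """
    w = weights[:, None] * rated              # weight counts only where a rating exists
    denom = w.sum(axis=0)
    num = (w * R0).sum(axis=0)
    counts = rated.sum(axis=0)
    plain = R0.sum(axis=0) / np.maximum(counts, 1)
    safe = np.where(denom > 0, denom, 1.0)    # avoid division by zero
    return np.where(denom > 0, num / safe, plain)

def correlation_rank(ratings, tol=1e-6, max_iter=100):
    """ratings: users x objects array, np.nan where a user gave no rating.

    Returns (quality, reputation). Assumed details: reputations start equal,
    negative correlations are clipped to 0, and iteration stops once the
    weighted ratings change by less than `tol`.
    """
    R = np.asarray(ratings, dtype=float)
    rated = ~np.isnan(R)
    R0 = np.where(rated, R, 0.0)
    n_users = R.shape[0]
    reputation = np.ones(n_users)
    quality = weighted_average(R0, rated, reputation)
    for _ in range(max_iter):
        new_rep = np.zeros(n_users)
        for i in range(n_users):
            mask = rated[i]
            if mask.sum() < 2:
                continue                      # correlation undefined for < 2 ratings
            x, y = R[i, mask], quality[mask]
            if x.std() == 0 or y.std() == 0:
                continue                      # constant vector: correlation undefined
            corr = np.corrcoef(x, y)[0, 1]
            new_rep[i] = max(corr, 0.0)       # assumption: no negative reputation
        new_quality = weighted_average(R0, rated, new_rep)
        change = np.abs(new_quality - quality).max()
        quality, reputation = new_quality, new_rep
        if change < tol:
            break
    return quality, reputation

# Hypothetical toy data: three ordinary users and one user rating in reverse.
ratings = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 4, 5],
    [1, 3, np.nan, 4, 4],
    [5, 4, 3, 2, 1],
]
quality, reputation = correlation_rank(ratings)
print("object ranking (best first):", np.argsort(-quality))
print("user reputations:", np.round(reputation, 3))
```

In this toy data the last user rates in the opposite order to everyone else, so the correlation with the weighted average is negative and, under the clipping assumption, that user's ratings stop influencing the ranking. The mean score baseline corresponds to keeping all reputations equal.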
Similar resources
QuickRank: A Recursive Ranking Algorithm
This paper presents QuickRank, an efficient algorithm for ranking individuals in a society, given a network that encodes their relationships, assuming that network possesses an accompanying hierarchical structure: e.g., the Enron email database together with the corporation’s organizational chart. The QuickRank design is founded on the “peer-review” principle, defined herein, and an hypothesis ...
DirichletRank: Ranking Web Pages Against Link Spams
Anti-spamming has become one of the most important challenges to web search engines and attracted increasing attention in both industry and academia recently. Since most search engines now use link-based ranking algorithms, link-based spamming has become a major threat. In this paper, we show that the popular link-based ranking algorithm PageRank, while being successfully used in the Google s...
Robust reputation-based ranking on multipartite rating networks
The spread of online reviews, ratings and opinions and its growing influence on people’s behavior and decisions boosted the interest to extract meaningful information from this data deluge. Hence, crowdsourced ratings of products and services gained a critical role in business, governments, and others. We propose a new reputation-based ranking system utilizing multipartite rating subnetworks, t...
Link Spam Detection based on DBSpamClust with Fuzzy C-means Clustering
Search engines have become an omnipresent means of accessing the web. Search engine spamming is a technique for deceiving a search engine's ranking and inflating it. Web spammers have taken advantage of the vulnerability of link-based ranking algorithms by creating many artificial references or links in order to acquire a higher-than-deserved ranking in search engines' results. Link b...
Detecting Link Hijacking by Web Spammers
Since current search engines employ link-based ranking algorithms as an important tool to decide a ranking of sites, web spammers are making a significant effort to manipulate the link structure of the Web, so-called link spamming. Link hijacking is an indispensable technique for link spamming to bring ranking scores from normal sites to target spam sites. In this paper, we propose a link anal...
Group-based ranking method for online rating systems with spamming attacks
Ranking problem has attracted much attention in real systems. How to design a robust ranking method is especially significant for online rating systems under the threat of spamming attacks. By building reputation systems for users, many well-performed ranking methods have been applied to address this issue. In this Letter, we propose a group-based ranking method that evaluates users’ reputation...
Journal title:
- CoRR
Volume abs/1012.3793, Issue
Pages -
Publication date: 2010